Abstract: We present an implementation of the calculation of the production of $W^+W^+$
plus two jets at hadron colliders, at next-to-leading order (NLO) in QCD, in the POWHEG
framework, which is a method that allows the interfacing of NLO calculations to shower
Monte Carlo programs. This is the first $2 \to 4$ process to be described to NLO accuracy
within a shower Monte Carlo framework. The implementation was built within the POWHEG 
BOX package. We discuss a few technical improvements that were needed in the POWHEG 
BOX to deal with the computer intensive nature of the NLO calculation, and argue that 
further improvements are possible, so that the method can match the complexity that is 
reached today in NLO calculations. We have interfaced our POWHEG implementation with 
PYTHIA and HERWIG, and present some phenomenological results, discussing similarities and 
differences between the pure NLO and the POWHEG+PYTHIA calculation both for inclusive 
and more exclusive distributions. We have made the relevant code available at the POWHEG 
BOX web site. 
1. Introduction

With the increase in energy and luminosity of the LHC, accurate predictions for high-multiplicity
processes become necessary. A lot of effort has been devoted in recent years
towards the calculation of next-to-leading order (NLO) corrections to various $2 \to 3$ and $2 \to 4$
scattering processes¹ [1-13]. Very recently, even dominant corrections to a $2 \to 5$ process
have been computed [14]. When NLO predictions are available, theoretical uncertainties 
are reduced compared to Born level predictions, and more accurate comparisons with 
experimental data become possible. However, NLO predictions describe the effect due to 
at most one additional parton in the final state. This is quite far from realistic LHC events, 
which involve a large number of particles in the final state. For infrared safe, sufficiently 
inclusive observables, NLO calculations provide accurate predictions, but this is not the 
case for more exclusive observables that are sensitive to the complex structure of LHC 
events. 

A complementary approach is provided by parton shower event generators, which generate
realistic hadron-level events, but only with leading logarithmic accuracy. In recent
years, methods that combine the benefits of an NLO calculation with a parton
shower model (an NLO+PS generator, from now on) have become available. Using these
methods one can thus generate exclusive, realistic events, maintaining NLO accuracy for
inclusive observables. Two NLO+PS frameworks are currently being used in hadron collider
physics: MC@NLO [15] and POWHEG [16,17]. In the past few years a number of processes
have been implemented in both frameworks. However, most processes included so far are
relatively simple $2 \to 1$ or $2 \to 2$ scattering processes, for which the one-loop correction
can be expressed in a closed, relatively simple analytic form.²

¹ As usual, in this counting we do not include the decay of heavy particles.

Figure 1: Sample diagrams for a) QCD and b) electroweak production mechanisms of $W^+W^+$
plus dijet production.

No $2 \to 4$ process has been implemented so far in any NLO+PS framework. Tools to
tackle processes of arbitrary complexity do however exist. A general computer framework
for the POWHEG implementation of arbitrary NLO processes has been presented in ref. [21],
the so-called POWHEG BOX. Within this framework, one needs to provide only a few ingredients
in order to build a POWHEG implementation of a given NLO process:
the phase space and flavour information, and the (spin- and colour-correlated) Born, real and
virtual matrix elements.

Recent NLO calculations of processes of high multiplicity use numerical methods to 
perform the reduction of tensor integrals "on the fly" or to compute coefficients of master 
integrals in terms of products of tree-level amplitudes. These methods allow the com- 
putation of the virtual corrections to very complex processes. On the other hand, these 
calculations become quite computer intensive. The computation of real radiation corrections
also requires considerable CPU time, since one needs to integrate over a phase space
of high dimension. For instance, if n on-shell particles are produced at Born level, the real
radiation term involves an integration over 3n + 1 variables.
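As a quick check of this counting, the following minimal Python sketch tallies the integration variables. It assumes the standard bookkeeping, which is not spelled out in the text: three phase-space variables per on-shell final-state particle, four removed by momentum conservation, and two added for the momentum fractions of the incoming partons. The function names are illustrative only.

```python
def born_dimension(n_born: int) -> int:
    """Integration variables for n_born on-shell final-state particles."""
    # 3 per particle, minus 4 for momentum conservation,
    # plus 2 for the momentum fractions of the incoming partons
    return 3 * n_born - 4 + 2


def real_dimension(n_born: int) -> int:
    """Integration variables for the real-emission term (one extra parton)."""
    return born_dimension(n_born + 1)


# Six final-state particles at Born level (four leptons and two partons):
print(born_dimension(6), real_dimension(6))  # 16 19
```

With n = 6 the real-emission count is 3n + 1 = 19, three more than the Born-level count, corresponding to the three radiation variables.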

In this paper we present a POWHEG BOX implementation of the QCD production of
$W^+W^+$ + 2 jets, including the leptonic decay of the W bosons with spin correlations. This is
the first time that a $2 \to 4$ process has been implemented in an NLO+PS framework. NLO
QCD corrections to $W^+W^+$ production have been computed recently using D-dimensional
unitarity [12]. The production of a $W^+W^+$ pair in hadronic collisions requires the presence
of two jets in the final state. Thus, in spite of the presence of these jets, there
are no collinear or soft divergences at the Born level. As such, the process presents no
complications due to the need of a generation cut [23]. However, since the NLO calculation 
is computer intensive, a number of technical issues arise in POWHEG that are not present for 
simpler processes. We discuss some of these issues in the present work, and find acceptable 
solutions for some of them. We also show that further efficiency improvements are possible, 
thus paving the road for the matching of NLO calculations and parton showers for yet more 
complex processes. 

We consider in this work only the QCD production mechanism of the $W^+W^+$ + 2 jets
final state, i.e. the process involving one gluon exchange and the direct emission of the
$W^+$ pair from quark lines (see fig. 1a). A different production mechanism is given by the
electroweak (EW) scattering process, where a colourless vector boson is exchanged in the
t-channel (see fig. 1b). This second mechanism, although of higher order in the electroweak
coupling constant, is only moderately smaller than the QCD one. We leave the POWHEG
implementation of the EW production process, for which NLO QCD corrections are also
known [22], to a future publication.

² Two notable exceptions are the POWHEG implementations of two $2 \to 3$ processes: vector boson fusion
Higgs production [18], and top pair production in association with one jet [19].

The $W^+W^+$ + 2 jets process is of considerable phenomenological interest. Its LHC
cross section, including the branching ratios to electrons and muons, is around 6 fb at 7 TeV
and 20 fb at 14 TeV. It has a distinct signature of two same-sign leptons, missing energy
and two jets. It therefore constitutes an important background to studies of double-parton
scattering [24], as well as to new physics signatures that involve two same-sign leptons, such
as R-parity violating SUSY models [25], diquark production with decay of the diquark to
a pair of top quarks [26] or doubly charged Higgs production [27]. This work will make it
possible to have a more reliable generator of this SM background in those physics studies,
which currently use only leading-order (LO) shower Monte Carlo programs.

The remainder of this paper is organized as follows. In section 2 we discuss the process 
under study. In section 3 we present a few technical issues related to the implementation of
the process in the POWHEG BOX [21] (more details are given in appendix A). In section 4
we present physical results for some kinematic distributions. We pay particular attention
to where the NLO results differ most from the POWHEG ones. We draw our conclusions and
outlook in section 5. 

2. $W^+W^+$ plus dijet production

The process $W^+W^+$ + 2 jets has been computed at NLO in [12], and we fully use those
results here. In this section we recall a few aspects of the calculation and refer the reader
to [12] for all other details.

For the one-loop calculation one expresses one-loop amplitudes as a linear combination 
of master integrals and uses D-dimensional unitarity to compute the coefficients of the 
master integrals in this decomposition of the amplitude. The coefficients are then given 
by products of tree-level amplitudes evaluated in higher dimensions and involving complex 
momenta. These tree-level helicity amplitudes are evaluated using recursive Berends-Giele 
relations [28]. This is the most natural choice since recursion relations can be easily used 
to compute amplitudes involving complex momenta in an arbitrary dimension. The master 
integrals are evaluated using the package QCDLoop [20].

The ingredients needed to implement a new process in the POWHEG BOX are then [21] 

• the list of the flavour structures in the Born and real processes for incoming and 
outgoing particles. Only one flavour structure must appear for each class of flavour 
structures that are equivalent up to a permutation of final state particles. In the 
present case we have 20 flavour structures for the Born and 36 for the real radiation 
contributions; 
• the Born phase space. In our case the Born phase space involves an integration over
16 variables (of which one azimuthal angle is irrelevant); 

• the Born, real and virtual squared matrix elements. Furthermore, one needs the Born
colour- and spin-correlated amplitudes. The latter arise only if there are gluons
as external particles, which is not the case in our process. The colour correlated
Born amplitudes are available in the code of ref. [12], where they are used in the 
computation of the virtual amplitudes. The matrix elements for each flavour structure 
should be appropriately symmetrized if identical particles appear in the final state; 

• the Born colour structures in the limit of a large number of colours. Once the POWHEG
event kinematics and flavour structure are generated, we must also assign a planar
colour structure to the event, which is needed by the shower program for building the subsequent
radiation and for modelling the hadronization process. In POWHEG this colour assignment
is based upon the colour structure of the Born term in the planar limit [21].
In the process at hand, we have two possibilities. In the case of two quark pairs of
distinct flavour, there is at Born level only one diagram, therefore the leading colour
structure is fixed. In the case of identical fermions there are two diagrams at Born
level, corresponding to s and t channel scattering. We then pick the leading colour
structure for each phase space point according to the value of the squared matrix
element for s and t channel scattering, neglecting the interference term.
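The colour assignment for identical fermions described in the last item can be illustrated with a minimal Python sketch. It assumes that "according to the value of the squared matrix element" means choosing each structure with probability proportional to its squared amplitude. The function name and arguments are hypothetical and do not correspond to routines in the POWHEG BOX.

```python
import random


def pick_colour_structure(m2_s: float, m2_t: float, rng: random.Random) -> str:
    """Return 's' or 't' with probability proportional to the squared amplitude."""
    total = m2_s + m2_t  # the interference term is deliberately left out
    if total <= 0.0:
        raise ValueError("squared matrix elements must have a positive sum")
    return "s" if rng.random() * total < m2_s else "t"


# Example: the s-channel term is three times larger than the t-channel one,
# so it should be selected in roughly 75% of the cases.
rng = random.Random(1)
picks = [pick_colour_structure(3.0, 1.0, rng) for _ in range(1000)]
print(picks.count("s") / len(picks))
```

Because the two squared amplitudes are positive, the selection probabilities are well defined even though the interference term is dropped.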

We performed the following checks on the implementation of the NLO calculation: 

• The POWHEG BOX computes internally the soft and collinear limits of the real ampli- 
tude, using only the Born cross section. These are compared to the full real amplitude 
in the soft and collinear limits, and the results of this comparison are written to a 
file. This is a valuable check on the real and Born amplitudes, and is performed 
automatically by the POWHEG BOX. 

• The POWHEG BOX can also be used to compute bare LO and NLO distributions. Using 
this feature, by fixing the same input as in [12], we have verified that we reproduce 
all LO and NLO cross-sections and distributions presented there. 

3. Technical details in POWHEG 

In this section we discuss some technical issues having to do with the POWHEG implementation
of the $W^+W^+$ + 2 jets process. We assume that the reader has some familiarity with
the POWHEG method.

At the beginning, the POWHEG BOX computes the integral of the so-called $\tilde{B}(X_i)$ function.
The $X_i$, which we denote collectively by $X$, are a set of $3n - 2$ variables (where $n$ is
the number of final state particles in the real emission process, including decay products),
defined in the unit cube, that parametrize the momentum fractions of the incoming partons
and the full phase space for real emission. More specifically, the first $3n - 5$ variables,
denoted collectively as $X_{\rm Born}$, parametrize the underlying Born configuration, while the
last three variables, denoted by $X_{\rm rad}$, parametrize the radiation. The integral

$$\bar{B}(X_{\rm Born}) = \int d^3 X_{\rm rad} \, \tilde{B}(X_{\rm Born}, X_{\rm rad}) \tag{3.1}$$

represents the inclusive cross section for the process in question at fixed underlying Born
configuration. POWHEG produces events by first generating the underlying Born configuration,
and then generating radiation using a shower technique.

The $\tilde{B}$ and $\bar{B}$ functions are sums of terms, each term referring to a specific flavour structure
of the underlying Born. POWHEG first computes the integral and an upper bounding
envelope of the $\tilde{B}(X)$ function. Using this upper bounding envelope, it is possible to generate
the $X$ variables with a probability distribution proportional to $\tilde{B}(X)$ using a hit and
miss technique. The $X_{\rm rad}$ values are discarded, and the remaining $X_{\rm Born}$ variables are thus
generated with a probability proportional to $\bar{B}$. The flavour structure of the underlying
Born configuration is chosen with a probability proportional to each flavour component of
$\tilde{B}(X)$ at the generated point. The function $\tilde{B}$ itself is the sum of the Born and the virtual
contributions evaluated at the underlying Born phase space configuration, plus an appropriate
combination of the real emission cross section, the soft and collinear subtractions,
and the collinear remnants from the subtraction of the initial state singularities. This combination
also depends upon the $X_{\rm rad}$ variables, while the Born and virtual contributions
do not. The evaluation of the $\tilde{B}$ function requires a calculation of the total virtual cross
section for each flavour configuration. It turns out that one evaluation of the $\tilde{B}$ function
requires a time of the order of 30 seconds.³ Although this seems to be a fairly long time,
it can be dealt with as long as the problem can be trivially parallelized on a large CPU cluster.
In fact, however, the problem can be parallelized only after the importance sampling
integration grid has been established. It is thus common practice, in this kind of NLO
calculation, to build the adaptive integration grid using only the Born contribution. In
our case, we introduced a switch in our input file, called fakevirt. If this token is set to
one, the virtual correction is replaced by a term proportional to the Born cross section.
We thus perform the first integration step, when the adaptive integration grid is formed,
with this token set, so that no calls to the virtual routines are performed. This way, it
is not difficult to obtain reasonable-looking adaptive grids with 500000 calls to the $\tilde{B}$
function, taking about 10 hours of CPU time. The same calculation using the full virtual
contribution would take a time of the order of 170 days, and would thus be unfeasible.
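The CPU-time estimates quoted in this paragraph follow from simple arithmetic. The short Python sketch below reproduces them, assuming 30 seconds per call when the full virtual contribution is evaluated and a single CPU running continuously. The function name is illustrative.

```python
SECONDS_PER_DAY = 86400.0


def cpu_days(n_calls: int, seconds_per_call: float) -> float:
    """Total single-CPU time, in days, for n_calls evaluations of the integrand."""
    return n_calls * seconds_per_call / SECONDS_PER_DAY


# Grid construction with the full virtual term: about 174 days
full_virtual_days = cpu_days(500_000, 30.0)

# With the virtual term replaced by a Born-like term, 500000 calls take about
# 10 hours, i.e. roughly 0.07 seconds per call
fake_virtual_seconds_per_call = 10 * 3600 / 500_000
```

The ratio of the two per-call times, roughly a factor of 400, is what makes building the grid with the fake virtual term worthwhile.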

After the importance sampling grid has been established, the computation of the integral
of the $\tilde{B}$ function, and the computation of the upper bounding envelope that is used
for the generation of the underlying Born configurations, can be performed in parallel. The
POWHEG BOX already had a mechanism to perform this stage of the computation in parallel 
and to combine all the results. 

We have found that it is convenient to use the so-called "folding" technique in the
integration of the $\tilde{B}$ function. The folding procedure is best explained by an example.

³ This is the typical time on a 2.4 GHz CPU, if the program is compiled with the ifort compiler.




Given a function $f(x)$ to be integrated by a Monte Carlo technique in the range $0 < x < 1$,
one can replace it by the function

$$F(y) = \frac{1}{m} \sum_{i=0}^{m-1} f\left(\frac{y+i}{m}\right), \tag{3.2}$$

also defined for $0 < y < 1$. It is obvious that

$$\int_0^1 F(y)\, dy = \int_0^1 f(x)\, dx. \tag{3.3}$$

It is also obvious that the larger $m$ is, the smoother the $F$ function will be, thus requiring
fewer points in a Monte Carlo integration. We call this procedure "folding" the $x$ variable.
In the POWHEG BOX, the radiation variables can be folded individually.⁴ In previous works,
the use of folding was advocated to avoid spurious negative weights. In the present case,
besides also serving that purpose, folding is used to balance the computer time needed
for the computation of the real and virtual contributions. In fact, since only the radiation
variables are folded, the virtual contribution is the same in a given folded set, and thus is
computed only once. The real contribution is instead computed several times. By using
this procedure, the $\tilde{B}$ function becomes a smoother function of the radiation variables, so
that the integration becomes easier to perform, and also the generation of the underlying
Born configurations becomes more efficient. It is found that with folding numbers of 5, 5,
10, referring respectively to the radiation variables $\xi$, $y$ and $\phi$, the time required to compute
the virtual contribution becomes comparable to the time for the evaluation of the real one.

As we said earlier, by using the Intel Fortran compiler, the time for a single call to the
virtual cross section is roughly 30 seconds. In order to get 500000 points, one needs about
170 days of computer time, a relatively easy task on modern clusters with hundreds
of CPUs. A further problem arises, however. Assuming that we are using 100 CPUs,
each process generates 5000 points. This is not enough to get a reasonable upper bounding
envelope for the generation of the underlying Born configuration. In fact, the procedure
used in the POWHEG BOX (described in ref. [29]) has no time to reach the upper bound with
such a small number of points. Even when combining together the upper bounds of the
different runs, one gets an unacceptable rate of upper bound failures in the generation of
the underlying Born configuration, of the order of 1 in every ten calls to the $\tilde{B}$ function, which
can thus affect final distributions. We modified the POWHEG BOX basic code in order to deal
with this problem. In short, the return values of the $\tilde{B}$ function calls were first written
to files by the parallel processes, and the upper bounding envelope was later evaluated
by reading all the generated files. After this step, the program is capable of generating
user process events (that is to say, events ready to be fed through a shower Monte Carlo
program). The efficiency, however, turns out to be very small, of the order of 2%. This
means that the generation time is of the order of 30/0.02 = 1500 seconds, roughly a couple
of events per hour. In order to collect 100000 events, we would thus need 500 hours on
a typical 100-CPU cluster. We were able to reach 15% efficiency by a further technical
modification to the POWHEG BOX code, which is described in appendix A.

⁴ In fact, rather than the $X_{\rm rad}$ variables, what is folded are the corresponding variables, piecewise linear
functions of the $X_{\rm rad}$, that have constant importance sampling in the adaptive grid.
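The folding of a single variable can be illustrated with a toy example. The Python sketch below builds the folded function F(y) = (1/m) Σ f((y+i)/m), with i running from 0 to m-1, for an arbitrary integrand on (0, 1), and provides a plain Monte Carlo estimator to compare the folded and unfolded integrals. The integrand and all function names are assumptions made for illustration and are not part of the POWHEG BOX.

```python
import math
import random


def fold(f, m):
    """Return F(y) = (1/m) * sum_{i=0}^{m-1} f((y + i) / m), defined on 0 < y < 1."""
    def folded(y):
        return sum(f((y + i) / m) for i in range(m)) / m
    return folded


def mc_integrate(g, n_points, rng):
    """Plain Monte Carlo estimate of the integral of g over (0, 1) and its error."""
    values = [g(rng.random()) for _ in range(n_points)]
    mean = sum(values) / n_points
    var = sum((v - mean) ** 2 for v in values) / (n_points - 1)
    return mean, math.sqrt(var / n_points)


def peaked(x):
    """Toy integrand (an assumption for this example); its integral over (0, 1) is 1."""
    return 3.0 * x * x


plain = mc_integrate(peaked, 2000, random.Random(1))
folded = mc_integrate(fold(peaked, 10), 2000, random.Random(1))
print(plain, folded)  # same integral, smaller statistical error after folding
```

Each call to the folded function costs m calls to the original integrand, so folding pays off when part of the integrand, here playing the role of the virtual term, does not depend on the folded variable and can be evaluated once per folded set.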

At this point, no further problems arise. One can generate the upper bound normal- 
ization for the generation of radiation, and start the event generation using a parallel CPU 
cluster. The upper bound failures in the generation of the underlying Born configurations 
are at the level of 2 per 1000 generated events. Due to the large folding numbers, the
fraction of negative weighted events is only 0.4%, an acceptable value. The generation
time is about three minutes per event. Most of the generation time is consumed by the
computation of $\tilde{B}$. Since the efficiency in the generation of the underlying Born is of the
order of 10%, the generation time is of the order of several times the time needed for the
computation of a single virtual point. Again, having a large CPU cluster at one's disposal,
it is not hard to generate a few hundred thousand events.
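The relation between generation efficiency and time per event can be made explicit with a small Python sketch. It assumes that each hit-and-miss attempt costs one evaluation of about 30 seconds and that the mean number of attempts per accepted event is the inverse of the efficiency. The function names are illustrative.

```python
def seconds_per_event(seconds_per_call: float, efficiency: float) -> float:
    """Mean time per accepted event: about 1/efficiency attempts are needed."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    return seconds_per_call / efficiency


def cluster_hours(n_events: int, seconds_per_call: float,
                  efficiency: float, n_cpus: int) -> float:
    """Wall-clock hours to generate n_events on n_cpus identical CPUs."""
    return n_events * seconds_per_event(seconds_per_call, efficiency) / 3600.0 / n_cpus


# 2% efficiency: 1500 s per event, more than 400 hours for 100000 events on 100 CPUs
# 15% efficiency: 200 s per event, i.e. about three minutes
print(seconds_per_event(30.0, 0.02), seconds_per_event(30.0, 0.15))
print(cluster_hours(100_000, 30.0, 0.02, 100))
```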

It is also worth asking whether other improvements in performance are actually possible.
Aside from considering more aggressive hardware requirements, like using GPUs
and the like, we have immediately noted another speed aspect that can be improved by
suitably modifying the POWHEG BOX code. In fact, at the moment, we compute the full $\tilde{B}$
function when we generate the underlying Born kinematics, and decide its flavour structure
on the basis of the size of each flavour contribution to it. This is what was implemented 
in the POWHEG BOX, mainly for reasons of simplicity. A more efficient approach would be 
to store sufficient information to generate each underlying Born flavour configuration in- 
dividually. In this way, the generation process would start by picking an underlying Born 
flavour configuration with probability proportional to the corresponding contribution to 
the total cross section. Given the underlying Born flavour configuration, one would then 
generate the underlying Born phase space. It is not unlikely that, with this approach, one 
may gain a factor of order 10 in speed. 
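The first step of the alternative strategy proposed here, choosing the flavour configuration before generating any kinematics, could look like the following Python sketch. The flavour labels and cross-section values are invented for illustration, and the function is hypothetical rather than an existing POWHEG BOX routine.

```python
import random


def pick_flavour(sigma_by_flavour: dict, rng: random.Random) -> str:
    """Pick a flavour structure with probability proportional to its cross section."""
    labels = list(sigma_by_flavour)
    weights = [sigma_by_flavour[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


# Invented per-flavour contributions to the total cross section (arbitrary units)
contributions = {"u u -> d d": 3.0, "u c -> d s": 1.0}
rng = random.Random(5)
flavour = pick_flavour(contributions, rng)
# The underlying Born phase space would then be generated for this flavour only,
# so that only one flavour component needs to be evaluated per attempt.
print(flavour)
```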

Alternatively, a speed gain may be achieved if the code that computes the virtual 
contribution is optimized to compute all flavour structure contributions to the virtual 
cross section at once. 

4. Results 

In this section we present our results. We consider proton-proton collisions with center-of-mass
energy $\sqrt{s} = 7$ TeV. We require the $W^+$ bosons to decay leptonically into $e^+ \mu^+ \nu_e \nu_\mu$.
The W bosons are produced on mass-shell and we assume a diagonal CKM matrix. Neglecting
interference effects, which are numerically suppressed since they force the W bosons
off mass-shell, the cross-section for same-flavour production is half that of different flavour. 
This implies that the full cross-section summing over electrons and muons can be obtained 
by multiplying the results presented here by a factor two. 

The setup used is largely inspired by [12], but we consider here a different center of 
mass energy. In the distributions shown, we impose the following leptonic cuts 

$$p_{t,l^+} > 20\ \mathrm{GeV}, \qquad p_{t,\rm miss} > 30\ \mathrm{GeV}, \qquad |\eta_{l^+}| < 2.4. \tag{4.1}$$

We define jets using the anti-$k_\perp$ algorithm [30], with the R parameter set to 0.4. We do not
impose any transverse-momentum cut on the two outgoing jets, nor do we impose lepton 
isolation cuts. 
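The leptonic cuts of eq. (4.1) can be written as a simple event filter. The Python sketch below assumes that the transverse momentum and pseudorapidity requirements are applied to each of the two charged leptons; for massless leptons pseudorapidity and rapidity coincide. The function name and argument layout are illustrative.

```python
def passes_leptonic_cuts(lepton_pts, lepton_etas, pt_miss):
    """Check p_t > 20 GeV and |eta| < 2.4 for each lepton, and p_t,miss > 30 GeV."""
    leptons_ok = all(pt > 20.0 and abs(eta) < 2.4
                     for pt, eta in zip(lepton_pts, lepton_etas))
    return leptons_ok and pt_miss > 30.0


# Two leptons with p_t of 25 and 40 GeV, both central, and 35 GeV of missing p_t
print(passes_leptonic_cuts([25.0, 40.0], [0.5, -2.0], 35.0))  # True
```

No requirement is placed on the jets, in line with the choice of not imposing a transverse-momentum cut on them.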

The mass of the W boson is taken to be $m_W = 80.419$ GeV, and the width $\Gamma_W =
2.141$ GeV. The W couplings to fermions are obtained from $\alpha_{\rm QED}(M_Z) = 1/128.802$ and
$\sin^2\theta_W = 0.2222$. We use MSTW08NLO parton distribution functions, corresponding
to $\alpha_s(M_Z) = 0.12018$ [31]. We consider the top quark infinitely heavy and neglect its
effects, while all other quarks are treated as massless. We set the factorization scale equal
to the renormalization scale, which we choose to be

$$\mu_R = \mu_F = \frac{p_{t,p_1} + p_{t,p_2} + E_{t,W_1} + E_{t,W_2}}{2}, \qquad E_{t,W} = \sqrt{m_W^2 + p_{t,W}^2}, \tag{4.2}$$

where $p_{t,W_1}$, $p_{t,W_2}$, $p_{t,p_1}$ and $p_{t,p_2}$ are the transverse momenta of the two $W^+$ and of the
two emitted partons in the underlying Born configuration. We use PYTHIA 6.4.21 [32] to
shower the events, include hadronization corrections and underlying event effects, with the 
Perugia tune (i.e. we call PYTUNE(320) before calling PYINIT). We have also showered 
the events using HERWIG [33]. We found only marginal differences between the HERWIG and 
PYTHIA results. Thus, we do not show any plot of HERWIG results. 

We remind the reader that we consider here only the QCD production of $W^+W^+jj$,
while we completely neglect the electroweak production. We also stress that we have 
not computed any theoretical error due to scale variations or PDF uncertainties on our 
distributions. Thus, our error bars are only statistical. The purpose of the plots we show 
is only to validate our results. Since the code is public, a user may study theoretical 
uncertainties at will. 

With the setup described above, we obtain an NLO total cross-section of 2.74 ± 0.03 fb,
which coincides by construction with the POWHEG+PYTHIA result. If we impose the leptonic
cuts of eq. (4.1), and do not require any minimum transverse momentum for the jets,
we have an NLO cross section of 1.11 ± 0.01 fb, and a slightly lower cross section with
POWHEG+PYTHIA of 1.06 ± 0.01 fb. Unless otherwise stated, these are the cuts applied to
the distributions presented in the following. If we also require at least two jets
with transverse momentum larger than 30 GeV we obtain an NLO inclusive cross-section
of 0.84 ± 0.01 fb. A 30 GeV transverse momentum cut was also applied to the third hardest
jet, when we plot its rapidity distribution.

We now discuss some kinematical distributions. We first consider leptonic inclusive
distributions. We plot in fig. 2 the inclusive transverse momentum distribution for the
charged lepton ($e^+$ or $\mu^+$) $p_{t,l^+}$ and its rapidity distribution $y_{l^+}$, the missing transverse
momentum, the charged lepton system invariant mass $m_{e^+\mu^+}$, and the transverse mass of
the two W bosons defined as

$$m_{T,WW}^2 = \left(E_{T,e^+\mu^+} + E_{T,\rm miss}\right)^2 - \left(\vec{p}_{t,e^+\mu^+} + \vec{p}_{t,\rm miss}\right)^2, \tag{4.3}$$

where the missing transverse energy $E_{T,\rm miss}$ is reconstructed from the missing transverse
momentum using the invariant mass of the charged lepton system, $E_{T,\rm miss} = \sqrt{p_{t,\rm miss}^2 + m_{e^+\mu^+}^2}$.

Figure 2: Leptonic kinematic distributions for the QCD production of $pp \to e^+ \mu^+ \nu_e \nu_\mu$ + 2 jets
at next-to-leading order and with POWHEG+PYTHIA. See text for more details.



For these distributions we find good agreement between NLO and POWHEG+PYTHIA, and do 
not observe any relevant difference in shape. 
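For concreteness, the transverse mass of eq. (4.3) can be evaluated as in the Python sketch below. The text defines the missing transverse energy explicitly; for the transverse energy of the charged lepton system the sketch assumes the analogous definition, the square root of the sum of the squared transverse momentum and the squared dilepton mass, which is not stated in the text. Transverse momenta are passed as two-component tuples in GeV.

```python
import math


def transverse_mass_ww(pt_ll, m_ll, pt_miss):
    """m_T,WW from 2D vectors pt_ll and pt_miss (GeV) and the dilepton mass m_ll (GeV)."""
    # Assumed definition for the charged lepton system
    et_ll = math.sqrt(pt_ll[0] ** 2 + pt_ll[1] ** 2 + m_ll ** 2)
    # Missing transverse energy reconstructed with the dilepton mass, as in the text
    et_miss = math.sqrt(pt_miss[0] ** 2 + pt_miss[1] ** 2 + m_ll ** 2)
    px = pt_ll[0] + pt_miss[0]
    py = pt_ll[1] + pt_miss[1]
    m2 = (et_ll + et_miss) ** 2 - (px ** 2 + py ** 2)
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors


# Back-to-back configuration: the vector sum vanishes and m_T,WW = 2 * 50 GeV
print(transverse_mass_ww((30.0, 0.0), 40.0, (-30.0, 0.0)))
```

With these definitions, a configuration in which the dilepton and missing transverse momenta are equal gives a transverse mass of twice the dilepton mass.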

In fig. 3 we show some hadronic inclusive distributions. We plot the transverse momentum
and rapidity distributions of the two leading jets, i.e. those with largest transverse
momentum, and the total transverse energy of the event ($H_{T,\rm TOT}$) defined as

$$H_{T,\rm TOT} = p_{t,e^+} + p_{t,\mu^+} + p_{t,\rm miss} + \sum_j p_{t,j}, \tag{4.4}$$

where the sum runs over all jets in the event. For the transverse momentum and rapidity
distributions we notice differences of the order of 10% between the NLO and POWHEG+PYTHIA
results. We also find that the POWHEG+PYTHIA distributions tend to be more peaked for
smaller jet transverse momenta, and also that jets tend to be slightly more central.

Figure 3: Hadronic kinematic distributions for the QCD production of $pp \to e^+ \mu^+ \nu_e \nu_\mu$ + 2 jets
at next-to-leading order and with POWHEG+PYTHIA. See text for more details.

The $H_{T,\rm TOT}$ distribution, on the other hand, displays large differences, especially in
the first bin, where the POWHEG+PYTHIA result is a factor 2 smaller than the NLO one. This
feature is easily explained. The shower and the underlying event generated by PYTHIA add
several soft particles to the event. Since we do not apply any transverse momentum cut,
these soft particles are clustered in jets, and contribute positively to $H_{T,\rm TOT}$. Because of
the large multiplicity of LHC events, assuming a typical transverse momentum of 500 MeV
for soft hadrons, we see that it is not inconceivable that this increase may reach values of
the order of 50 GeV. The first bin of the distribution is the most affected one, because this
mechanism can cause events to migrate to higher bins from it, while no event will migrate
backward. This explanation is easily tested. First of all, we show in the left plot of fig. 4
that the POWHEG user process event, without PYTHIA shower, is in good agreement with the
NLO result. We see that, if anything, it is the POWHEG distribution that is slightly above the
NLO one. This proves that this feature does not originate from the POWHEG implementation.
In the right plot, we show a comparison of the POWHEG+PYTHIA and the NLO calculation
for the $H_{T,\rm TOT}$ distribution, this time defined to involve only the three hardest jets. The
NLO distribution, of course, is not affected. On the other hand, the POWHEG+PYTHIA result is
brought into much better agreement with the NLO calculation.

Figure 4: On the left we show a comparison of the NLO and bare POWHEG distributions. On the
right we show the POWHEG+PYTHIA and the NLO $H_{T,\rm TOT}$ distributions when only the three hardest
jets are considered in the computation.
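The two definitions of the total transverse energy compared here, summing over all jets or only over the three hardest ones, can be sketched in Python as follows. The function name and the optional max_jets argument are illustrative, and the numbers in the example are invented.

```python
def ht_tot(lepton_pts, pt_miss, jet_pts, max_jets=None):
    """Scalar sum of lepton p_t, missing p_t and jet p_t (optionally hardest jets only)."""
    jets = sorted(jet_pts, reverse=True)
    if max_jets is not None:
        jets = jets[:max_jets]
    return sum(lepton_pts) + pt_miss + sum(jets)


# Two hard jets plus many soft jets, as produced by the shower and underlying event
soft_jets = [0.5] * 100            # 100 soft objects of 500 MeV each
jets = [60.0, 35.0] + soft_jets
print(ht_tot([30.0, 25.0], 40.0, jets))              # soft activity adds 50 GeV
print(ht_tot([30.0, 25.0], 40.0, jets, max_jets=3))  # nearly insensitive to it
```

Since every jet contributes positively, soft activity can only move events towards larger values, which is the migration out of the first bin described in the text.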

Finally, we show in fig. 5 the transverse momentum and rapidity of the third jet (in
the latter distribution we impose a transverse momentum cut of 30 GeV on the jets), and
the relative transverse momentum distribution of the particles inside the two leading jets,
defined with respect to the jet axis in the frame where the jet has zero rapidity:

$$p_{t,{\rm rel},j} = \sum_{i \in j} \frac{|\vec{k}_i \times \vec{p}_j|}{|\vec{p}_j|}. \tag{4.5}$$

Here $k_i$ denotes the momentum of the $i$-th particle and $p_j$ that of the $j$-th jet. At NLO it is
only the real radiation that contributes to these four distributions, and we see clearly a
divergence at small $p_{t,j_3}$ and $p_{t,{\rm rel},j}$, while in the POWHEG+PYTHIA prediction the distribution
has a Sudakov peak and goes to zero for $p_{t,j_3}, p_{t,{\rm rel},j} \to 0$. We also remark that the third
jet tends to be more central with POWHEG+PYTHIA.
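A possible implementation of the relative transverse momentum of eq. (4.5) is sketched below in Python. It assumes that the jet momentum is the four-vector sum of its constituents and that momenta are given as (E, px, py, pz) tuples; both are assumptions of this example. The constituents are first boosted along the beam axis to the frame where the jet has zero rapidity.

```python
import math


def boost_z(p, y):
    """Longitudinal boost of p = (E, px, py, pz) that lowers its rapidity by y."""
    e, px, py, pz = p
    ch, sh = math.cosh(y), math.sinh(y)
    return (e * ch - pz * sh, px, py, pz * ch - e * sh)


def rapidity(p):
    e, _, _, pz = p
    return 0.5 * math.log((e + pz) / (e - pz))


def pt_rel(particles):
    """Sum over constituents of |k_i x p_j| / |p_j| in the frame where the jet has y = 0."""
    jet = tuple(sum(c) for c in zip(*particles))  # assumed: jet = sum of constituents
    y = rapidity(jet)
    jx, jy, jz = boost_z(jet, y)[1:]
    norm = math.sqrt(jx * jx + jy * jy + jz * jz)
    total = 0.0
    for k in particles:
        kx, ky, kz = boost_z(k, y)[1:]
        cx, cy, cz = ky * jz - kz * jy, kz * jx - kx * jz, kx * jy - ky * jx
        total += math.sqrt(cx * cx + cy * cy + cz * cz) / norm
    return total


# Two massless constituents placed symmetrically around the jet axis
constituents = [(5.0, 3.0, 4.0, 0.0), (5.0, 3.0, -4.0, 0.0)]
print(pt_rel(constituents))  # each contributes 4 GeV
```

A jet made of a single particle has vanishing relative transverse momentum, consistent with the statement that only real radiation contributes to this observable at NLO.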

Figure 5: Hadronic kinematic distributions for the QCD production of $pp \to e^+ \mu^+ \nu_e \nu_\mu$ + 2 jets
at next-to-leading order and with POWHEG+PYTHIA. See text for more details.

5. Conclusions

In this work we have presented a POWHEG implementation for the QCD production of $pp \to W^+W^+$
plus two jets, at NLO in the strong coupling constant, with W leptonic decays
included with NLO accurate spin correlations. The NLO corrections for this process have
been computed recently using D-dimensional unitarity in ref. [12]. In this work we just
focused upon building up a POWHEG implementation, in order to consistently interface the
calculation to shower Monte Carlo generators. The POWHEG implementation was built in
the framework of the POWHEG BOX [21].

The $pp \to W^+W^+ + 2j$ process is of considerable phenomenological interest, being an
important background to new physics signatures, and to the study of double parton scattering
phenomena. Furthermore, its study is also interesting since it represents a first
POWHEG implementation of a complex, $2 \to 4$ scattering process, where the calculation of
the virtual corrections is highly demanding from a computational point of view. Besides
this issue, the POWHEG implementation of this process does not present any special problem.
The Born cross section is finite, in spite of the presence of the two jets in the final state, so,
from this point of view, the process is similar to Higgs boson production in vector boson
fusion [18]. However, the large amount of computer time required for the calculation of
the virtual contributions has in practice turned out to be a difficult problem to deal with,
so that the POWHEG BOX implementation of the process was in fact not completely trivial.

We have spotted a number of possible improvements to the POWHEG BOX code that can
result in a substantial increase in efficiency, and we have implemented the simplest
ones. We were thus able to generate an adequate number of events for this process and,
most importantly, we have convinced ourselves that the POWHEG BOX efficiency can be 
increased even further, in order to match the level of complexity that is now possible in 
NLO calculations. 

We have compared the POWHEG BOX result, interfaced with the PYTHIA and HERWIG
Monte Carlo programs, with the bare NLO one, and have found consistency with the features observed
in other implementations: very inclusive observables, like the lepton spectra, display a 
remarkable agreement; quantities involving leading jets also agree well, with only minor 
differences; quantities involving the radiated jet display marked differences in the Sudakov 
region. 

Finally, we have made our code public. It can be retrieved by following the instructions 
at the POWHEG BOX web site http://powhegbox.mib.infn.it. 
A. Raising the generation efficiency

Due to the large amount of computer time needed to compute virtual corrections, it is 
mandatory to increase the generation efficiency for the underlying Born configurations. 
For processes with many external legs, this efficiency can be in fact quite low. 

The POWHEG BOX generates the underlying Born and radiation kinematics using a hit
and miss technique. An upper bounding envelope of the $\tilde{B}$ function is found, of the form

$$\tilde{B}(X) \le \prod_{i=1}^{n} f^{(i)}(X_i), \tag{A.1}$$

where the $X_i$ are the integration variables, and the $f^{(i)}$ functions are step functions of the
integration variables. The size of the step is determined by the importance sampling grid
itself, as documented in ref. [29]. In order to generate a configuration, the points $X_i$ are
first generated with a probability distribution proportional to $f^{(i)}(X_i)$. Then a uniform random
number $r$, with

$$0 < r < \prod_{i=1}^{n} f^{(i)}(X_i), \tag{A.2}$$

is generated. One then computes $\tilde{B}(X)$. If $r < \tilde{B}(X)$ we have a hit, and the configuration
is kept. Otherwise the configuration is rejected (we have a miss), and we restart the
procedure.
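The hit and miss procedure with a factorized envelope can be illustrated by the toy Python sketch below. It assumes step functions with equal-width bins and a two-dimensional toy integrand, whereas in the POWHEG BOX the steps follow the importance sampling grid. All class and function names are hypothetical.

```python
import random


class StepFunction:
    """Piecewise-constant f on (0, 1) with equal-width bins (a simplifying assumption)."""

    def __init__(self, heights):
        self.heights = list(heights)

    def __call__(self, x):
        n = len(self.heights)
        return self.heights[min(int(x * n), n - 1)]

    def sample(self, rng):
        """Draw x with probability density proportional to f(x)."""
        i = rng.choices(range(len(self.heights)), weights=self.heights, k=1)[0]
        return (i + rng.random()) / len(self.heights)


def hit_and_miss(btilde, envelopes, rng):
    """Return one point X distributed as btilde and the number of btilde calls used."""
    calls = 0
    while True:
        x = [f.sample(rng) for f in envelopes]
        bound = 1.0
        for f, xi in zip(envelopes, x):
            bound *= f(xi)             # product of step functions at the sampled point
        r = rng.random() * bound       # uniform between 0 and the envelope value
        calls += 1
        if r < btilde(x):              # hit: keep the configuration
            return x, calls


def toy_btilde(x):
    """Toy integrand, bounded by the product of the step functions used below."""
    return (0.5 + x[0]) * (0.5 + x[1])


envelopes = [StepFunction([1.0, 1.5]), StepFunction([1.0, 1.5])]
point, n_calls = hit_and_miss(toy_btilde, envelopes, random.Random(2))
print(point, n_calls)
```

The efficiency is the ratio of the integral of the function to the integral of the envelope, so every additional variable whose step function is not tight reduces it by a further factor.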

It is clear that, if the number of integration variables is large (as in our case), an upper
bound of the form of eq. (A.1) will generally be highly inefficient, just because the product
of a large number of terms will tend to build up large values. In order to remedy this
problem, we have exploited the fact that the $\tilde{B}$ function is equal to the Born cross section
plus higher order terms. It is thus natural to expect that an upper bound of the form

$$\tilde{B}(X) \le B(X) \times \prod_{i=1}^{n} g^{(i)}(X_i) \tag{A.3}$$

will be much closer to it. We thus determine the $g$ functions for this bound, using the same
technique used for the $f$ functions. We then modified our code in such a way that, before
computing $\tilde{B}$ in order to test for a hit or miss, we compute the right hand side of eq. (A.3).
If it is smaller than $r$, $\tilde{B}$ will also be smaller than $r$, and we thus know that we have a miss
without the need to compute the time consuming $\tilde{B}$ function. By adopting this method,
we have reached an efficiency of 15%, instead of the 1-2% efficiency that we achieve with the
POWHEG BOX default method.
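The effect of the Born-based veto can be seen in a toy setting. The Python sketch below is a deliberately simplified, one-dimensional illustration: it uses a flat envelope and a constant factor in place of the products of step functions, and toy functions standing in for the Born and full integrands. All names and numerical values are assumptions made for this example.

```python
import math
import random


def toy_born(x):
    """Cheap stand-in for the Born cross section (sharply peaked at x = 0)."""
    return math.exp(-10.0 * x)


def toy_btilde(x):
    """Stand-in for the expensive function: the Born term plus a 10% correction."""
    return 1.1 * math.exp(-10.0 * x)


def generate(btilde, born, f_bound, g_bound, rng):
    """Hit and miss for btilde with a cheap veto based on born(x) * g_bound.

    Assumes btilde(x) <= f_bound and btilde(x) <= born(x) * g_bound on (0, 1).
    Returns the accepted x and the number of expensive btilde evaluations.
    """
    expensive_calls = 0
    while True:
        x = rng.random()             # flat envelope: sample x uniformly
        r = rng.random() * f_bound   # uniform between 0 and the envelope value
        if born(x) * g_bound < r:    # then btilde(x) < r as well: a certain miss
            continue
        expensive_calls += 1
        if r < btilde(x):
            return x, expensive_calls


rng = random.Random(7)
results = [generate(toy_btilde, toy_born, 1.1, 1.3, rng) for _ in range(2000)]
print(sum(calls for _, calls in results) / len(results))  # expensive calls per event
```

In this toy case the flat envelope alone would require about ten evaluations of the expensive function per accepted point, while with the veto most misses are identified using the cheap function only. The accepted points follow the same distribution in both cases, since the veto rejects only configurations that would have been misses anyway.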